{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/integrations/openai/beyond_search_webinar/02_pinecone-code-demo.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/integrations/openai/beyond_search_webinar/02_pinecone-code-demo.ipynb)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "with open('data/mapping.json', 'r') as fp:\n",
    "    mappings = json.load(fp)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import openai\n",
    "from openai.embeddings_utils import get_embedding\n",
    "\n",
    "openai.api_key = 'OPENAI_API_KEY'  # platform.openai.com/login/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now create embeddings with OpenAIs embedding models like so:\n",
    "\n",
    "```python\n",
    "q_embeddings = get_embedding(\n",
    "    'how to use gradient tape in tensorflow',\n",
    "    engine=f'text-search-curie-query-001'\n",
    ")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We initialize our connection to Pinecone."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pinecone import Pinecone\n",
    "\n",
    "def load_index():\n",
    "    pinecone.init(\n",
    "        api_key='PINECONE_API_KEY',  # app.pinecone.io\n",
    "        environment=\"YOUR_ENV\"  # find next to API key in console\n",
    "    )\n",
    "\n",
    "    index_name = 'beyond-search-openai'\n",
    "\n",
    "    if not index_name in pinecone.list_indexes().names():\n",
    "        raise KeyError(f\"Index '{index_name}' does not exist.\")\n",
    "\n",
    "    return pinecone.Index(index_name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "index = load_index()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Define a function that will use OpenAI to create a query embedding, then use it to retrieve the most relevant context embeddings from Pinecone. These contexts are appended into a larger string ready for feeding into OpenAIs next generation step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "import openai\n",
    "from openai.embeddings_utils import get_embedding\n",
    "\n",
    "def create_context(question, index, max_len=3750, size=\"curie\"):\n",
    "    \"\"\"\n",
    "    Find most relevant context for a question via Pinecone search\n",
    "    \"\"\"\n",
    "    q_embed = get_embedding(question, engine=f'text-search-{size}-query-001')\n",
    "    res = index.query(vector=q_embed, top_k=5, include_metadata=True)\n",
    "    \n",
    "\n",
    "    cur_len = 0\n",
    "    contexts = []\n",
    "\n",
    "    for row in res['matches']:\n",
    "        text = mappings[row['id']]\n",
    "        cur_len += row['metadata']['n_tokens'] + 4\n",
    "        if cur_len < max_len:\n",
    "            contexts.append(text)\n",
    "        else:\n",
    "            cur_len -= row['metadata']['n_tokens'] + 4\n",
    "            if max_len - cur_len < 200:\n",
    "                break\n",
    "    return \"\\n\\n###\\n\\n\".join(contexts)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's test context retrieval..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Topic: tensorflow - General Discussion; Question: Tape Gradient C++ - Hi, Would like to use GradientTape in c++ but cannot find any code sample that demonstrates how to do so. In python: with GradientTape as tape: tape.watch(x) y = x * x In C++: ? Does anyone have some sample to share? Best, Dom; Answer: You can find something at:   github.com   tensorflow/tensorflow/blob/master/tensorflow/c/eager/gradients_test.cc 16  /* Copyright 2020 The TensorFlow Authors. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at  http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. ==============================================================================*/ #include \"tensorflow/c/eager/gradients.h\" #include <memory> #include \"absl/container/flat_hash_set.h\" #include \"absl/types/span.h\"  This file has been truncated. show original\\n\\n###\\n\\nTopic: tensorflow - General Discussion; Question: GradientTape on eager mode - GradientTape cannot compute the gredients for the model. How can I debug this code? class Training(keras.Model): def __init__(self, model):  super(Training, self).__init__()  self.model = model def compute_loss(self, texts, labels):  texts = tf.math.l2_normalize(texts, axis=0)  losses = tf.Variable(tf.zeros_like(labels, dtype=tf.float32), trainable=True, dtype=tf.float32)  for index, label in enumerate(labels):  pos_pairs = texts[labels == label]  neg_pairs = texts[labels != label]  if len(pos_pairs) > 1:   p_list = tf.Variable( tf.zeros(pos_pairs.shape[0], dtype=tf.float32), trainable=True, dtype=tf.float32 )   i = 0   for pos_pair in pos_pairs:   p_list[i].assign( keras.losses.cosine_similarity(texts[index], pos_pair) )   i += 1   p_list = tf.exp( p_list )   p_list = p_list / tf.reduce_sum(p_list)   p_loss = tf.reduce_sum( - tf.math.log( p_list ) )  else:   p_loss = 0.0  if len(neg_pairs) > 1:   n_list = tf.Variable( tf.zeros(neg_pairs.shape[0], dtype=tf.float32), trainable=True, dtype=tf.float32 )   i = 0   for neg_pair in neg_pairs:   n_list[i].assign( keras.losses.cosine_similarity(texts[index], neg_pair) )   i += 1   n_list = tf.exp( n_list )   n_list = n_list / tf.reduce_sum(n_list)   n_loss = tf.reduce_sum( tf.math.log( n_list ) )  else:   n_loss = 0.0     loss_on_sentence = p_loss + n_loss  losses[index].assign( loss_on_sentence )   loss = tf.reduce_mean(losses)  return loss def train_step(self, data):  texts = data[0]  labels = data[1]  #print(labels, texts)  with tf.GradientTape() as tape:  texts = self.model(texts)  loss = self.compute_loss(texts, labels)  print(loss)  trainable_vars = self.trainable_variables  #print(trainable_vars)  gradients = tape.gradient(loss, trainable_vars)  print(gradients)  self.optimizer.apply_gradients(zip(gradients, trainable_vars))  loss_tracker.update_state(loss)  return {\"loss\": loss_tracker.result()} @property def metrics(self):  return [loss_tracker] trainer = Training(model) trainer.compile(optimizer=\\'adam\\', run_eagerly=True) trainer.fit(train_dataset, callbacks=[tensorboard_callback]) Error: ValueError: No gradients provided for any variable: [\\'dense1/kernel:0\\', \\'dense1/bias:0\\', \\'dense2/kernel:0\\', \\'dense2/bias:0\\', \\'bn1/gamma:0\\', \\'bn1/beta:0\\', \\'dense3/kernel:0\\', \\'dense3/bias:0\\', \\'dense4/kernel:0\\', \\'dense4/bias:0\\', \\'bn2/gamma:0\\', \\'bn2/beta:0\\', \\'dense5/kernel:0\\', \\'dense5/bias:0\\', \\'dense6/kernel:0\\', \\'dense6/bias:0\\', \\'bn3/gamma:0\\', \\'bn3/beta:0\\', \\'dense7/kernel:0\\', \\'dense7/bias:0\\', \\'dense8/kernel:0\\', \\'dense8/bias:0\\', \\'bn4/gamma:0\\', \\'bn4/beta:0\\', \\'dense9/kernel:0\\', \\'dense9/bias:0\\', \\'bn5/gamma:0\\', \\'bn5/beta:0\\']. Can anyone help to debug this error!! Thanking you in advance.; Answer: Which part breaks the gradient flow? This type of code works in pytorch but here it breaks flow somewhere and I cannot figure out which part cause the problem.\\n\\n###\\n\\nTopic: tensorflow - General Discussion; Question: Custom loss function - “Shapes of all inputs must match” - Hi y’all… continuing the saga from my previous post. I’ve included code snippets for your viewing at the bottom. Let me know if you need more! Currently, I’m trying to build out a GradientTape with just some integers I obtained from a custom loss function. It seems like it’s trying to find the gradient for multiple variables at once, as I had to change the GradientTape to persistent, or I got the following error: RuntimeError: A non-persistent GradientTape can only be used to compute one set of gradients (or jacobians) This workaround necessitates that I would manually delete the GradientTape later, which of course I’m not the biggest fan of… Thanks for reading and take care! ======================================================= Some output : model_2: print loss_value_tensor in get grad f’n(48,) model_2: print x shape in compute loss f’n(48, 28, 28, 1) model_2: print y shape in compute loss f’n(48,) model_2: print loss_value shape in compute loss f’n() … And that will make sense when you see the code. Error was: Shapes of all inputs must match: values[0].shape = [3,3,1,8] != values[1].shape = [8] [Op:Pack] name: initial_value ======================================================= Here’s the code snippets I think you’d need to diagnose the problem. The matrices could be filled out with dummy data, since we’re just worried about the numpy array shapes, and the tensor shapes:   gist.github.com   https://gist.github.com/erick016/898ebb309ec9cfaae314aaef8ff0e385 3 loss_fns #Training replacement d = 0 #from algorithm 2, used in computing loss #alpha = .33 #learning rate #def nt_grad(curr_d):  #return -1*alpha*(curr_d/BATCH_SIZE) def compute_loss(model, x, y, training):  out = model(x, training=training)  This file has been truncated. show original           gist.github.com   https://gist.github.com/erick016/fbf91402b5b743a9d3f1467d081c165b 1 bigL.py  print(\"Reached Big L.\")   for batch_idx in range(NUM_BATCHES):   #print(\"epoch_overall_loss_per_batch Index:\" + str(epoch * NUM_BATCHES + batch_idx))   print(\"batch_idx:\" + str(batch_idx))   #print(\"train_loss_per_batch_L1 Shape:\" + str(np.shape(train_loss_per_batch_L1)))   #print(\"train_loss_per_batch_L2 Shape:\" + str(np.shape(train_loss_per_batch_L2)))   #print(\"train_loss_per_batch_L3 Shape:\" + str(np.shape(train_loss_per_batch_L3)))   #print(\"epoch_overall_loss_per_batch Shape:\" + str(np.shape(train_loss_per_batch_L3)))   #print(\"===================================\")  This file has been truncated. show original; Answer: Relevant article I’m looking at:   stackoverflow.com       How to fix tensorflow \"InvalidArgumentError: Shapes of all inputs must match\" 14  python, tensorflow, machine-learning, keras  asked by   Madison Sheridan  on 10:41PM - 22 Jul 19 UTC        The actual custom optimizer I’m using:   gist.github.com   https://gist.github.com/erick016/30567f54946cf9e2804db2ab10da5dd4 1 custom_optimizer.py #!/usr/bin/env python # coding: utf-8 # In[1]: #momentum \"m_hat\" and gradient \"g_hat\" # In[2]:  This file has been truncated. show original\\n\\n###\\n\\nTopic: tensorflow - General Discussion; Question: How to fix Random seed in Gradient tape - How to fix random seed to calculate gradient (gradient tape)? [Environment] OS: Windows 10 Tensorflow 2.6.0 [Issue] I show GradCAM image like this sample code.     keras.io    Keras documentation: Grad-CAM class activation visualization 1        It uses GradientTape to get gradient, and the result is different each time in spite of same source image and logit values of inference. So heatmap image is different in each time with this issue. I fixed random seed like below, but I couldn’t fix random seed for gradient tape. import os import random os.environ[‘TF_DETERMINISTIC_OPS’] = ‘1’ SEED = 42 random.seed(SEED) np.random.seed(SEED) tf.random.set_seed(SEED) os.environ[“PYTHONHASHSEED”] = str(SEED) Calc gradient and Get heatmap like below: def make_gradcam_heatmap(img_array, model, last_conv_layer_name, pred_index=None): with tf.GradientTape() as tape: last_conv_layer_output, preds = grad_model(img_array)#this preds’ value is same in every time with fix random seed if pred_index is None: pred_index = tf.argmax(preds[0]) class_channel = preds[:, pred_index] # This is the gradient of the output neuron (top predicted or chosen) # with regard to the output feature map of the last conv layer grads = tape.gradient(class_channel, last_conv_layer_output); Answer: Bhack, I solved this problem! The cause is I set “input_tensor” when I make a model. I can get a deterministic heatmap not to set input_tensor!\\n\\n###\\n\\nTopic: tensorflow - General Discussion; Question: Multi-GPU doesn’t work for model(inputs) nor when computing the gradients - Hi, When using multiple GPUs to perform inference on a model (e.g. the call method: model(inputs)) and calculate its gradients, the machine only uses one GPU, leaving the rest idle. For example in this code snippet below: import tensorflow as tf import numpy as np import os # Make the tf-data path_filename_records = \\'your_path_to_records\\' bs = 128 dataset = tf.data.TFRecordDataset(path_filename_records) dataset = (dataset   .map(parse_record, num_parallel_calls=tf.data.experimental.AUTOTUNE)   .batch(bs)   .prefetch(tf.data.experimental.AUTOTUNE)   ) # Load model trained using MirroredStrategy path_to_resnet = \\'your_path_to_resnet\\' mirrored_strategy = tf.distribute.MirroredStrategy() with mirrored_strategy.scope():  resnet50 = tf.keras.models.load_model(path_to_resnet) for pre_images, true_label in dataset:  with tf.GradientTape() as tape:  tape.watch(pre_images)  outputs = resnet50(pre_images)  grads = tape.gradient(outputs, pre_images) Only one GPU is used. You can profile the behavior of the GPUs with nvidia-smi. I don’t know if it is supposed to be like this, both the model(inputs) and tape.gradient to not have multi-GPU support. But if it is, then it’s a big problem because if you have a large dataset and need to calculate the gradients with respect to the inputs (e.g. interpretability porpuses) it might take days with one GPU. Another thing I tried was using model.predict() but this isn’t possible with tf.GradientTape.; Answer: Here’s a working example of it. I ask to whoever has more than one GPU to try and check if the issue still remains. Notebook: Working Example.ipynb 8 Saved Model: HDF5'"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "create_context(\"how do I use gradient tapes in tensorflow\", index)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can move onto answering a question. This step will take a query, encode it, retrieve other contexts (as done above), and then pass them onto OpenAIs generation model within a specific format that can be modified via the `instruction` parameter below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "def answer_question(\n",
    "    index,\n",
    "    fine_tuned_qa_model=\"text-davinci-002\",\n",
    "    question=\"Am I allowed to publish model outputs to Twitter, without a human review?\",\n",
    "    instruction=\"Answer the question based on the context below, and if the question can't be answered based on the context, say \\\"I don't know\\\"\\n\\nContext:\\n{0}\\n\\n---\\n\\nQuestion: {1}\\nAnswer:\",\n",
    "    max_len=3550,\n",
    "    size=\"curie\",\n",
    "    debug=False,\n",
    "    max_tokens=400,\n",
    "    stop_sequence=None,\n",
    "    domains=[\"huggingface\", \"tensorflow\", \"streamlit\", \"pytorch\"],\n",
    "):\n",
    "    \"\"\"\n",
    "    Answer a question based on the most similar context from the dataframe texts\n",
    "    \"\"\"\n",
    "    context = create_context(\n",
    "        question,\n",
    "        index,\n",
    "        max_len=max_len,\n",
    "        size=size,\n",
    "    )\n",
    "    if debug:\n",
    "        print(\"Context:\\n\" + context)\n",
    "        print(\"\\n\\n\")\n",
    "    try:\n",
    "        # fine-tuned models requires model parameter, whereas other models require engine parameter\n",
    "        model_param = (\n",
    "            {\"model\": fine_tuned_qa_model}\n",
    "            if \":\" in fine_tuned_qa_model\n",
    "            and fine_tuned_qa_model.split(\":\")[1].startswith(\"ft\")\n",
    "            else {\"engine\": fine_tuned_qa_model}\n",
    "        )\n",
    "        #print(instruction.format(context, question))\n",
    "        response = openai.Completion.create(\n",
    "            prompt=instruction.format(context, question),\n",
    "            temperature=0,\n",
    "            max_tokens=max_tokens,\n",
    "            top_p=1,\n",
    "            frequency_penalty=0,\n",
    "            presence_penalty=0,\n",
    "            stop=stop_sequence,\n",
    "            **model_param,\n",
    "        )\n",
    "        return response[\"choices\"][0][\"text\"].strip()\n",
    "    except Exception as e:\n",
    "        print(e)\n",
    "        return \"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's initialize a few query/instruction formats..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "instructions = {\n",
    "    \"conservative Q&A\": \"Answer the question based on the context below, and if the question can't be answered based on the context, say \\\"I don't know\\\"\\n\\nContext:\\n{0}\\n\\n---\\n\\nQuestion: {1}\\nAnswer:\",\n",
    "    \"paragraph about a question\":\"Write a paragraph, addressing the question, and use the text below to obtain relevant information\\\"\\n\\nContext:\\n{0}\\n\\n---\\n\\nQuestion: {1}\\nParagraph long Answer:\",\n",
    "    \"bullet point\": \"Write a bullet point list of possible answers, addressing the question, and use the text below to obtain relevant information\\\"\\n\\nContext:\\n{0}\\n\\n---\\n\\nQuestion: {1}\\nBullet point Answer:\",\n",
    "    \"summarize problems given a topic\": \"Write a summary of the problems addressed by the questions below\\\"\\n\\n{0}\\n\\n---\\n\\n\",\n",
    "    \"extract key libraries and tools\": \"Write a list of libraries and tools present in the context below\\\"\\n\\nContext:\\n{0}\\n\\n---\\n\\n\",\n",
    "    \"just instruction\": \"{1} given the common questions and answers below \\n\\n{0}\\n\\n---\\n\\n\",\n",
    "    \"summarize\": \"Write an elaborate, paragraph long summary about \\\"{1}\\\" given the questions and answers from a public forum on this topic\\n\\n{0}\\n\\n---\\n\\nSummary:\",\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By default we use *\"conservative Q&A\"* which returns `\"I don't know\"` when unsure of the answer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"I don't know.\""
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "answer_question(index, question=\"What are GPT-2's strengths and weaknesses?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's try and ask a few more questions with different instructions..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The question is about how to finetune the CLIPModel further on their own dataset. The person tried using the default data_collator with the Trainer, but it didn't work.\n"
     ]
    }
   ],
   "source": [
    "print(answer_question(index, question=\"OpenAI CLIP\", \n",
    "                            instruction = instructions[\"summarize\"], debug=False))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The questions above address the problems of using OpenAI's CLIP model for image search and style transfer, as well as how to load the model onto a GPU. Additionally, the question about training a CLIP-like model for German language raises the challenge of modifying existing models to add projection layers.\n"
     ]
    }
   ],
   "source": [
    "print(answer_question(index, question=\"OpenAI CLIP\", \n",
    "                            instruction = instructions[\"summarize problems given a topic\"], debug=False))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "- Huggingface Transformers\n",
      "- Bert\n",
      "- MarkupLM\n",
      "- HTML\n",
      "- Kiela et all.\n",
      "- UMAP\n",
      "- HDBSCAN\n",
      "- T5\n",
      "- BERT-base\n",
      "- GPT-2\n"
     ]
    }
   ],
   "source": [
    "print(answer_question(index, question=\"embedding models, which embed images and text\", \n",
    "                            instruction = instructions[\"extract key libraries and tools\"], debug=False))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Tensorflow and Pytorch are both major machine learning libraries. Tensorflow is maintained and released by Google while Pytorch is maintained and released by Facebook.\n",
      "\n",
      "Tensorflow is more convenient in the industry (prototyping, deployment and scalability is easier) and PyTorch more handy in research (its more pythonic and it is easier to implement complex stuff).\n"
     ]
    }
   ],
   "source": [
    "print(answer_question(index, question=\"Compare and contrast Tensorflow and Pytorch\", \n",
    "                            instruction = instructions[\"just instruction\"], debug=False,\n",
    "                            domains=[ \"tensorflow\", \"pytorch\"]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "- One way to summarize text is to use extractive summarization, where you choose the top k sentences from the text.\n",
      "- Another way is to use abstractive summarization, where you generate a summary of the text that is shorter than the original text.\n",
      "- You can also combine the two methods, using extractive summarization to choose the sentences you want to include in the summary, and then using abstractive summarization to generate the summary itself.\n",
      "- Finally, you can also use successive abstractive summarization, where you summarize the text in chunks, and then use those chunks to generate a summary of the desired length.\n"
     ]
    }
   ],
   "source": [
    "print(answer_question(index, question=\"What are some of the ways to summarize text?\", \n",
    "                            instruction = instructions[\"bullet point\"], debug=False))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Some of the common problems of trying to run GPT-J 6B yourself include:\n",
      "- Not being able to find the model on the Hugging Face Hub\n",
      "- Getting a KeyError when trying to download the checkpoint\n",
      "- Not being able to get the model working with the latest transformers version\n"
     ]
    }
   ],
   "source": [
    "print(answer_question(index, question=\"What are some of the common problems of trying to run GPT-J 6B yourself?\", \n",
    "                            instruction = instructions[\"paragraph about a question\"], debug=False))\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "There are a few ways to convert TensorFlow code into PyTorch. One way is to use the Open Neural Network Exchange (ONNX) format. ONNX is a format that allows models to be transferred between different frameworks. To convert a TensorFlow model to PyTorch using ONNX, you can use the onnx-tensorflow converter.\n",
      "\n",
      "Another way to convert TensorFlow code is to use the PyTorch converter. The PyTorch converter is a tool that converts TensorFlow models into PyTorch models. The converter is still in beta, but it should be able to convert most TensorFlow models into PyTorch models.\n",
      "\n",
      "Finally, you can also convert TensorFlow code into PyTorch code manually. This is usually not recommended, as it can be quite difficult to get the code to work correctly. However, if you are familiar with both frameworks, it may be possible to convert the code manually.\n"
     ]
    }
   ],
   "source": [
    "print(answer_question(index, question=\"How can I convert tensorflow code into pytorch?\", \n",
    "                            instruction = instructions[\"paragraph about a question\"], debug=False))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'OpenAI is an artificial intelligence research laboratory consisting of the for-profit corporation OpenAI LP and its parent company, the non-profit OpenAI Inc. OpenAI is headquartered in San Francisco, California. OpenAI\\'s goal is to \"advance digital intelligence in the way that is most likely to benefit humanity as a whole\".[1] Since its founding, OpenAI has worked on a number of projects involving machine learning and artificial intelligence.\\n\\nIn 2015, OpenAI released an open source artificial intelligence software called Universe.[2] Universe allows any computer program to be used as a potential environment for training artificial intelligence agents.\\n\\nIn 2016, OpenAI released an open source machine learning library called OpenAI Gym.[3] OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms.\\n\\nIn 2017, OpenAI released an open source artificial intelligence software called OpenAI Baselines.[4] OpenAI Baselines is a set of high-quality implementations of reinforcement learning algorithms.\\n\\nIn 2018, OpenAI released an open source machine learning library called TensorFlow Agents.[5] TensorFlow Agents is a library for reinforcement learning in TensorFlow.\\n\\nIn 2019, OpenAI released an open source machine learning library called JAX.[6] JAX is a library for machine learning research that uses automatic differentiation and XLA.'"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "answer_question(index, question=\"What are the open source models released by OpenAI?\", \n",
    "                            instruction = instructions[\"paragraph about a question\"], debug=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "I don't know.\n"
     ]
    }
   ],
   "source": [
    "print(answer_question(index, question=\"How can I use embeddings to visualize my data?\"))"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once you're finished with the index, delete it to save resources:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pinecone.delete_index(index_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "base",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13 (default, Mar 28 2022, 06:59:08) [MSC v.1916 64 bit (AMD64)]"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "5fe10bf018ef3e697f9035d60bf60847932a12bface18908407fd371fe880db9"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
